Discovery of Power-Laws in Chemical Space
Abstract
Power-law distributions have been observed in a wide variety of areas. To our knowledge, however, there has been no systematic observation of power-law distributions in chemoinformatics. Here, we present several examples of power-law distributions arising from the features of small, organic molecules. The distributions of rigid segments and ring systems, the distributions of molecular paths and circular substructures, and the sizes of molecular similarity clusters all show linear trends on log-log rank/frequency plots, suggesting underlying power-law distributions. The number of unique features also follows Heaps'-like laws. The characteristic exponents of the power-laws lie in the 1.5-3 range, consistent with the exponents observed in other power-law phenomena. The power-law nature of these distributions leads to several applications, including the prediction of the growth of available data through Heaps' law and the optimal allocation of experimental or computational resources via the 80/20 rule. More importantly, we also show how the power-laws can be leveraged to efficiently compress chemical fingerprints in a lossless manner, which is useful for the improved storage and retrieval of molecules in large chemical databases.
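The rank/frequency check and the 80/20 rule mentioned in the abstract can be illustrated with a minimal Python sketch. This is not the authors' code. The function names (`rank_frequency`, `loglog_slope`, `top_share`) and the plain least-squares fit are assumptions chosen for illustration, and the features are assumed to be simple hashable labels such as ring-system identifiers.

```python
import math
from collections import Counter


def rank_frequency(features):
    """Return occurrence counts sorted in decreasing order (rank 1 first)."""
    return sorted(Counter(features).values(), reverse=True)


def loglog_slope(freqs):
    """Least-squares slope of log(frequency) against log(rank).

    A roughly constant slope over the whole rank range is the linear
    trend on a log-log rank/frequency plot that suggests a power law.
    """
    xs = [math.log(rank) for rank in range(1, len(freqs) + 1)]
    ys = [math.log(f) for f in freqs]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def top_share(freqs, fraction=0.2):
    """Share of all occurrences covered by the top `fraction` of distinct features."""
    k = max(1, int(len(freqs) * fraction))
    return sum(freqs[:k]) / sum(freqs)


if __name__ == "__main__":
    # Synthetic, idealized frequencies f(r) = 1000 / r (illustration only,
    # not data from the paper).
    ideal = [1000.0 / r for r in range(1, 101)]
    print("log-log slope:", loglog_slope(ideal))
    print("share held by top 20% of features:", top_share(ideal))
```

A least-squares fit on log-log axes is only a quick diagnostic. It shows whether the trend is linear, but it is not a rigorous estimator of the exponent.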
Journal:
- Journal of Chemical Information and Modeling
Volume 48, Issue 6
Pages: -
Publication date: 2008